Abstract: In this paper, one studies the famous well-known and challenging Tweety Penguin Triangle Problem (TPTP or TP2)

1 Introduction


Judea Pearl claimed that the Dempster-Shafer Theory (DST) of evidence fails to provide a reasonable solution for the combination of evidence, even for apparently very simple fusion problems [B]. Most of these criticisms were answered by Philippe Smets in [23] [24]. The Tweety Penguin Triangle Problem (TP2) is a typical and challenging problem for all theories managing uncertainty and conflict, because it shows how difficult it is for automatic reasoning systems to maintain truth when the classical property of transitivity (which is basic to material implication) does not hold. In his book [12], Judea Pearl presents and discusses in detail the semantic clash between Bayesian and Dempster-Shafer reasoning. We present here our new analysis of this problem and provide a solution of the Tweety Penguin Triangle Problem based on our new theory of plausible and paradoxical reasoning, known as DSmT (Dezert-Smarandache Theory). We show how this problem can be attacked and solved by our new reasoning with the help of the (hybrid) DSm rule of combination.


The purpose of this paper is not to survey all the approaches available in the literature for attacking the TP2 problem, but only to compare the DSm reasoning with the Bayesian reasoning and with the plausible reasoning of the DST framework. An interesting but complex analysis of this problem based on default reasoning and ε-belief functions can also be found, for example, in and [I]. Other interesting and promising approaches to the TP2 problem, based on the fuzzy logic of Zadeh jointly with the theory of possibilities [6] [7], are under investigation. Some theoretical research works on new conditional event algebras (CEA) have emerged in the literature [8] in recent years and could offer a new track for attacking the TP2 problem, although unfortunately no clear, didactic, simple and convincing examples are provided to show the real efficiency and usefulness of these theoretical investigations.


2 The Tweety Penguin Triangle Problem 


This important and challenging problem, known in the literature as the Tweety Penguin Triangle Problem (TP2), is presented in detail by Judea Pearl in [12]. We briefly present here the TP2 and the solutions based first on the fallacious Bayesian reasoning and then on the Dempster-Shafer reasoning. We then focus our analysis of this problem on the DSmT framework and the DSm reasoning.


Let's consider the set $R = \{r_1, r_2, r_3\}$ of given rules:

• $r_1$: "Penguins normally don't fly" $\Leftrightarrow (p \rightarrow \neg f)$
• $r_2$: "Birds normally fly" $\Leftrightarrow (b \rightarrow f)$
• $r_3$: "Penguins are birds" $\Leftrightarrow (p \rightarrow b)$


To emphasize our strong conviction in these rules, we commit to them some high confidence weights $w_1$, $w_2$ and $w_3$ in $[0, 1]$ with $w_1 = 1 - \epsilon_1$, $w_2 = 1 - \epsilon_2$ and $w_3 = 1$ (where $\epsilon_1$ and $\epsilon_2$ are small positive quantities). The conviction in these rules is then represented by the set $W = \{w_1, w_2, w_3\}$ in the sequel.


Another useful and general notation, adopted by Judea Pearl in the first pages of his book to characterize these three weighted rules, is the following one (where $w_1, w_2, w_3 \in [0, 1]$):

$$r_1 : p \xrightarrow{w_1} (\neg f) \qquad r_2 : b \xrightarrow{w_2} f \qquad r_3 : p \xrightarrow{w_3} b$$


When $w_1, w_2, w_3 \in \{0, 1\}$, classical logic is the perfect tool to conclude on the truth or falsity of a proposition built from these rules, based on the standard propositional calculus and mainly on its three fundamental rules (Modus Ponens, Modus Tollens and Modus Barbara, i.e. the transitivity rule). When $0 < w_1, w_2, w_3 < 1$, classical logic can't be applied because the Modus Ponens, the Modus Tollens and the Modus Barbara no longer hold, and some other tools must be chosen. This will be discussed in detail in section 3.2.


Question: Assume we observe an animal called Tweety ($T$) that is categorically classified as a bird ($b$) and a penguin ($p$), i.e. our observation is $O = [T = (b \cap p)] = [(T = b) \cap (T = p)]$. The notation $T = (b \cap p)$ stands here for "Entity $T$ holds property $(b \cap p)$". What is the belief (or the probability, if such a probability exists) that Tweety can fly given the observation $O$ and all information available in our knowledge base (i.e. our rule-based system $R$ and $W$)?


The difficulty of this problem for most artificial reasoning systems (ARS) comes from the fact that, in this example, the property of transitivity, usually supposed satisfied from the material-implication interpretation [12], $(p \rightarrow b, b \rightarrow f) \Rightarrow (p \rightarrow f)$, does not hold (see section 3.2). In this interesting example, the classical property of inheritance is thus broken. Nevertheless, a powerful artificial reasoning system must be able to deal with this kind of difficult problem and must provide a reliable conclusion by a general mechanism of reasoning, whatever the values of the convictions are (not only values close to either 0 or 1). We now examine three ARS: the first is based on the Bayesian reasoning, which turns out to be fallacious and actually not appropriate for this problem (and we explain why), the second on the Dempster-Shafer Theory (DST) [17], and the third on the Dezert-Smarandache Theory (DSmT) BII.


3 The fallacious Bayesian reasoning 


We first present the fallacious Bayesian reasoning solution drawn from J. Pearl's book [12] (pages 447-449), and then we explain why this solution, which at first glance seems to agree with intuition, is really fallacious. We then explain why the common rational intuition actually turns out to be wrong.


3.1 The Pearl’s analysis 


To preserve mathematical rigor, we introduce explicitly all information available in the derivations. In other words, one wants to evaluate, using the Bayesian reasoning, the conditional probability, if it exists, $P(T = f | O, R, W) = P(T = f | T = p, T = b, R, W)$. Pearl's analysis is based on the assumption that a conviction on a given rule can be interpreted as a conditional probability (see page 4). In other words, if one has a given rule $a \xrightarrow{w} b$ with $w \in [0, 1]$, then one can interpret, at least for the calculus, $w$ as $P(b|a)$, and thus probability theory and Bayesian reasoning can help to answer the question. We prove in the following section that such a model cannot be reasonably adopted. For now, we just assume that such a probabilistic model holds effectively, as Judea Pearl does. Based on this assumption, since the conditioning information $(T = p, T = b, R, W)$ is strictly equivalent to $(T = p, R, W)$ because rule $r_3$ is known with certainty (since $w_3 = 1$), one easily gets Pearl's intuitively expected (but fallacious) result:








$$\begin{aligned}
P(T = f | O, R, W) &= P(T = f | T = p, T = b, R, W) \\
&= P(T = f | T = p, R, W) \\
&= 1 - P(T = \neg f | T = p, R, W) \\
&= 1 - w_1 = \epsilon_1
\end{aligned}$$


From this simple analysis, Tweety's "birdness" does not render her a better flyer than an ordinary penguin, as intuitively expected, and the probability that Tweety can fly remains very low, which looks normal. We reemphasize here the fact that, in his Bayesian reasoning, J. Pearl assumes that the weight $w_1$ for the conviction in rule $r_1$ can be interpreted in terms of a real probability measure $P(\neg f | p)$. This assumption is necessary to provide the rigorous derivation of $P(T = f | O, R, W)$. It turns out however that convictions $w_i$ on logical rules cannot be interpreted in terms of probabilities, as we will prove in the next section.


When rule $r_3$ is not asserted with absolute certainty (i.e. $w_3 = 1$) but is subject to exceptions, i.e. $w_3 = 1 - \epsilon_3 < 1$, the fallacious Bayesian reasoning yields (where the notations $T = f$, $T = b$ and $T = p$ are replaced by $f$, $b$ and $p$ for notational convenience):

$$P(f | O, R, W) = P(f | p, b, R, W) = \frac{P(f, p, b | R, W)}{P(p, b | R, W)} = \frac{P(f, b | p, R, W) P(p | R, W)}{P(b | p, R, W) P(p | R, W)}$$


By assuming $P(p | R, W) > 0$, one gets after simplification by $P(p | R, W)$

$$P(f | O, R, W) = \frac{P(f, b | p, R, W)}{P(b | p, R, W)} = \frac{P(b | f, p, R, W) P(f | p, R, W)}{P(b | p, R, W)}$$

If one assumes $P(b | p, R, W) = w_3 = 1 - \epsilon_3$ and $P(f | p, R, W) = 1 - P(\bar{f} | p, R, W) = 1 - w_1 = \epsilon_1$, one gets

$$P(f | O, R, W) = P(b | f, p, R, W) \times \frac{\epsilon_1}{1 - \epsilon_3}$$


Because $0 \le P(b | f, p, R, W) \le 1$, one finally gets Pearl's result (p. 448)

$$P(f | O, R, W) \le \frac{\epsilon_1}{1 - \epsilon_3} \tag{1}$$

which states that the observed animal Tweety (a penguin-bird) has a very small probability of flying as long as $\epsilon_3$ remains small, regardless of how many birds cannot fly ($\epsilon_2$), and consequently has a high probability of not flying, because $P(\bar{f} | O, R, W) + P(f | O, R, W) = 1$ since the events $f$ and $\bar{f}$ are mutually exclusive and exhaustive (assuming that Pearl's probabilistic model holds).
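
As a quick numerical illustration of bound (1), the following minimal Python sketch evaluates Pearl's expression under his probabilistic model. The function names `pearl_fly_probability` and `pearl_upper_bound` and the sample values of the epsilons are illustrative assumptions, not part of Pearl's text.

```python
def pearl_fly_probability(eps1: float, eps3: float, p_b_given_f_p: float) -> float:
    """P(f|O,R,W) = P(b|f,p,R,W) * eps1 / (1 - eps3) under Pearl's model."""
    if not 0.0 <= eps3 < 1.0:
        raise ValueError("eps3 must lie in [0, 1)")
    if not 0.0 <= p_b_given_f_p <= 1.0:
        raise ValueError("P(b|f,p,R,W) must be a probability")
    return p_b_given_f_p * eps1 / (1.0 - eps3)


def pearl_upper_bound(eps1: float, eps3: float) -> float:
    """Upper bound (1), reached when P(b|f,p,R,W) = 1."""
    return pearl_fly_probability(eps1, eps3, 1.0)


# Hypothetical convictions: eps2 does not appear anywhere in the bound.
bound = pearl_upper_bound(eps1=0.01, eps3=0.02)
```

Note that the bound depends only on the first and third epsilons, which is exactly the point made in the text: the proportion of non-flying birds plays no role in this model.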


3.2 The weakness of the Pearl’s analysis 


We now prove that the previous Bayesian reasoning is really fallacious and that, if a deep analysis is done, the question of whether Tweety can fly or not is truly undecidable. Actually, Bayes' inference is not a classical inference [3]. Indeed, before blindly applying the Bayesian reasoning as in the previous section, one first has to check that the probabilistic model is well-founded to characterize the convictions of the rules of the rule-based system under analysis. We prove here that such a probabilistic model doesn't hold for a suitable and useful representation of the problem, and consequently for any problem based on the weighting of logical rules (with positive weighting factors/convictions below 1).


3.2.1 Preliminaries 


We recall here only a few important principles of the propositional calculus of classical Mathematical Logic which will be used in our demonstration. A simple notation, which may appear unusual to logicians, is adopted here just for convenience. A detailed presentation of the propositional calculus and Mathematical Logic can easily be found in many standard mathematical textbooks like [16] [11] [10]. Here are these important principles:


• Third middle excluded principle: A logical variable is either true or false, i.e.

$$a \vee \neg a \tag{2}$$

• Non-contradiction law: A logical variable can't be both true and false, i.e.

$$\neg(a \wedge \neg a) \tag{3}$$

• Modus Ponens: This rule of the propositional calculus states that if a logical variable $a$ is true and $a \rightarrow b$ is true, then $b$ is true (syllogism principle), i.e.

$$(a \wedge (a \rightarrow b)) \rightarrow b \tag{4}$$

• Modus Tollens: This rule of the propositional calculus states that if a logical variable $\neg b$ is true and $a \rightarrow b$ is true, then $\neg a$ is true, i.e.

$$(\neg b \wedge (a \rightarrow b)) \rightarrow \neg a \tag{5}$$

• Modus Barbara: This rule of the propositional calculus states that if $a \rightarrow b$ is true and $b \rightarrow c$ is true, then $a \rightarrow c$ is true (transitivity property), i.e.

$$((a \rightarrow b) \wedge (b \rightarrow c)) \rightarrow (a \rightarrow c) \tag{6}$$


From these principles, one can easily prove, based on the truth table method, the following property (more general deducibility theorems in Mathematical Logic can be found in [19] [20]):

$$((a \rightarrow b) \wedge (c \rightarrow d)) \rightarrow ((a \wedge c) \rightarrow (b \wedge d)) \tag{7}$$


3.2.2 Analysis of the problem when $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$


We first examine the TP2 when one has no doubt in the rules of our given rule-based system, i.e. when $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$, so that the three rules $r_1$, $r_2$ and $r_3$ hold with certainty ($w_1 = w_2 = w_3 = 1$).

From rules $r_1$ and $r_2$ and because of property (7), one concludes that

$$p \wedge b \rightarrow (f \wedge \neg f)$$

and using the non-contradiction law (3) with the Modus Tollens (5), one finally gets

$$\neg(f \wedge \neg f) \rightarrow \neg(p \wedge b)$$

which proves that $p \wedge b$ is always false whatever the rule $r_3$ is. Interpreted in terms of probability theory, the event $T = p \cap b$ corresponds actually and truly to the impossible event $\emptyset$, since $T = f$ and $T = \bar{f}$ are exclusive and exhaustive events. Under such conditions, the analysis proves the non-existence of the penguin-bird Tweety.
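
The principles (2)-(7) and the conclusion of this subsection can be checked by brute force with truth tables. The following Python sketch is only an illustration; the helper names (`implies`, `is_tautology`, `penguin_bird_possible`) are our own assumptions.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material implication x -> y."""
    return (not x) or y


def is_tautology(formula, n_vars: int) -> bool:
    """True if `formula` holds for every truth assignment of its n_vars arguments."""
    return all(formula(*values) for values in product([False, True], repeat=n_vars))


# Principles (2)-(7), each paired with its number of logical variables.
PRINCIPLES = {
    "(2) third middle excluded": (lambda a: a or (not a), 1),
    "(3) non-contradiction": (lambda a: not (a and (not a)), 1),
    "(4) modus ponens": (lambda a, b: implies(a and implies(a, b), b), 2),
    "(5) modus tollens": (lambda a, b: implies((not b) and implies(a, b), not a), 2),
    "(6) modus barbara": (
        lambda a, b, c: implies(implies(a, b) and implies(b, c), implies(a, c)),
        3,
    ),
    "(7) conjunction of rules": (
        lambda a, b, c, d: implies(
            implies(a, b) and implies(c, d), implies(a and c, b and d)
        ),
        4,
    ),
}

checks = {name: is_tautology(f, n) for name, (f, n) in PRINCIPLES.items()}

# With r1 (p -> not f) and r2 (b -> f) held as certain, look for a truth
# assignment in which Tweety is both a penguin and a bird.
penguin_bird_possible = any(
    p and b
    for p, b, f in product([False, True], repeat=3)
    if implies(p, not f) and implies(b, f)
)
```

Every entry of `checks` is true, and no assignment compatible with the two certain rules makes both `p` and `b` true, which is the non-existence result stated above.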


If one adopts the notations¹ of probability theory, trying to derive $P(T = f | T = p \cap b)$ and $P(T = \bar{f} | T = p \cap b)$ with the Bayesian reasoning is just impossible, because from one of the axioms of probability theory one must have $P(\emptyset) = 0$, and from the conditioning rule one would get expressly for this problem the indeterminate expressions:








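$$P(T = f | T = p \cap b) = P(T = f | T = \emptyset) = \frac{P(T = f \cap \emptyset)}{P(T = \emptyset)} = \frac{P(T = \emptyset)}{P(T = \emptyset)} = \frac{0}{0} \quad \text{(indeterminate)}$$

and similarly

$$P(T = \bar{f} | T = p \cap b) = P(T = \bar{f} | T = \emptyset) = \frac{P(T = \bar{f} \cap \emptyset)}{P(T = \emptyset)} = \frac{P(T = \emptyset)}{P(T = \emptyset)} = \frac{0}{0} \quad \text{(indeterminate)}$$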


3.2.3 Analysis of the problem when $0 < \epsilon_1, \epsilon_2, \epsilon_3 < 1$


Let's now examine the general case when one allows some little doubt on the rules, characterized by taking $\epsilon_1 \neq 0$, $\epsilon_2 \neq 0$ and $\epsilon_3 \neq 0$, and examine the consequences for the probabilistic model of these rules.


¹ Because probabilities are related to sets, we use here the common set-complement notation $\bar{f}$ instead of the logical negation notation $\neg f$, $\cap$ for $\wedge$ and $\cup$ for $\vee$ if necessary.


First note that, because of the third middle excluded principle and the assumption of the existence of a probabilistic model for a weighted rule, one should be able to consider simultaneously both "probabilistic/Bayesian" rules

$$a \xrightarrow{P(b|a) = w} b \qquad \text{and} \qquad a \xrightarrow{P(\bar{b}|a) = 1 - w} \neg b \tag{8}$$


In terms of classical (objective) probability theory, these weighted rules just indicate that in $100 \times w$ percent of cases the logical variable $b$ is true if $a$ is true, or equivalently, that in $100 \times w$ percent of cases the random event $b$ occurs when the random event $a$ occurs. When we don't refer to classical probability theory, the weighting factors $w$ and $1 - w$ just indicate the level of conviction committed to the validity of the rules. Although very appealing at first glance, this probabilistic model actually hides a strong drawback/weakness, especially when dealing with several rules, as shown right below.

Let's prove first that from a "probabilized" rule $a \xrightarrow{P(b|a) = w} b$ one cannot rigorously assess the convictions on its Modus Tollens. In other words, from (8), what can we conclude on

$$\neg b \xrightarrow{P(\bar{a}|\bar{b}) = ?} \neg a \tag{9}$$


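From Bayes' rule of conditioning (which must hold if the probabilistic model holds), one can express $P(\bar{a}|\bar{b})$ and $P(\bar{a}|b)$ as follows

$$\begin{aligned}
P(\bar{a}|\bar{b}) &= 1 - P(a|\bar{b}) = 1 - \frac{P(a \cap \bar{b})}{P(\bar{b})} = 1 - \frac{P(\bar{b}|a) P(a)}{P(\bar{b})} \\
P(\bar{a}|b) &= 1 - P(a|b) = 1 - \frac{P(a \cap b)}{P(b)} = 1 - \frac{P(b|a) P(a)}{P(b)}
\end{aligned}$$

or equivalently, by replacing $P(b|a)$ and $P(\bar{b}|a)$ by their values $w$ and $1 - w$, one gets

$$\begin{aligned}
P(\bar{a}|\bar{b}) &= 1 - (1 - w) \frac{P(a)}{P(\bar{b})} \\
P(\bar{a}|b) &= 1 - w \frac{P(a)}{P(b)}
\end{aligned} \tag{10}$$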


These relationships show that one cannot, in theory, fully derive $P(\bar{a}|\bar{b})$ and $P(\bar{a}|b)$, because the prior probabilities $P(a)$ and $P(b)$ are unknown.


A simplistic solution, based on the principle of indifference, is then just to assume, without solid justification, that $P(a) = P(\bar{a}) = 1/2$ and $P(b) = P(\bar{b}) = 1/2$. With such an assumption, one gets the estimates $\hat{P}(\bar{a}|\bar{b}) = w$ and $\hat{P}(\bar{a}|b) = 1 - w$ for $P(\bar{a}|\bar{b})$ and $P(\bar{a}|b)$ respectively, and we can go further in the derivations.
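
To make the dependence on the unknown priors concrete, the short Python sketch below evaluates the two Modus Tollens probabilities derived above for a given conviction and given priors. The function name `modus_tollens_probabilities` and the numerical priors are illustrative assumptions only.

```python
def modus_tollens_probabilities(w: float, p_a: float, p_b: float):
    """Return (P(not a | not b), P(not a | b)) given P(b|a) = w, P(a) and P(b)."""
    if not 0.0 < p_b < 1.0:
        raise ValueError("P(b) must lie strictly between 0 and 1")
    p_not_b = 1.0 - p_b
    # The joints P(a, b) = w P(a) and P(a, not b) = (1 - w) P(a) cannot
    # exceed the marginals P(b) and P(not b).
    if w * p_a > p_b or (1.0 - w) * p_a > p_not_b:
        raise ValueError("priors are incompatible with P(b|a) = w")
    return 1.0 - (1.0 - w) * p_a / p_not_b, 1.0 - w * p_a / p_b


# Principle of indifference: P(a) = P(b) = 1/2 gives back (w, 1 - w).
indifference = modus_tollens_probabilities(w=0.9, p_a=0.5, p_b=0.5)

# Other (hypothetical) priors give different values for the same conviction w.
skewed = modus_tollens_probabilities(w=0.9, p_a=0.2, p_b=0.6)
```

With the same conviction of 0.9, the two sets of priors lead to different Modus Tollens probabilities, which is why the weight of the rule alone cannot determine them.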


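Now let's go back to our Tweety Penguin Triangle Problem. Based on the probabilistic model (assumed to hold), one starts now with both

$$\begin{aligned}
r_1 :\ & p \xrightarrow{P(\bar{f}|p) = 1 - \epsilon_1} \neg f, & \quad & p \xrightarrow{P(f|p) = \epsilon_1} f \\
r_2 :\ & b \xrightarrow{P(f|b) = 1 - \epsilon_2} f, & \quad & b \xrightarrow{P(\bar{f}|b) = \epsilon_2} \neg f \\
r_3 :\ & p \xrightarrow{P(b|p) = 1 - \epsilon_3} b, & \quad & p \xrightarrow{P(\bar{b}|p) = \epsilon_3} \neg b
\end{aligned} \tag{11}$$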


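Note that, taking into account our preliminary analysis and accepting the principle of indifference, one also has the two sets of weighted rules

$$\begin{aligned}
& f \xrightarrow{\hat{P}(\bar{p}|f) = 1 - \epsilon_1} \neg p, & \quad & \neg f \xrightarrow{\hat{P}(\bar{p}|\bar{f}) = \epsilon_1} \neg p \\
& \neg f \xrightarrow{\hat{P}(\bar{b}|\bar{f}) = 1 - \epsilon_2} \neg b, & \quad & f \xrightarrow{\hat{P}(\bar{b}|f) = \epsilon_2} \neg b \\
& \neg b \xrightarrow{\hat{P}(\bar{p}|\bar{b}) = 1 - \epsilon_3} \neg p, & \quad & b \xrightarrow{\hat{P}(\bar{p}|b) = \epsilon_3} \neg p
\end{aligned} \tag{12}$$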


One wants to assess the convictions (assumed to correspond to some conditional probabilities) in the following rules

$$p \wedge b \xrightarrow{P(f|p \cap b) = ?} f \tag{13}$$

$$p \wedge b \xrightarrow{P(\bar{f}|p \cap b) = ?} \neg f \tag{14}$$


The question is to rigorously derive $P(f|p \cap b)$ and $P(\bar{f}|p \cap b)$ from all previously available information. It turns out that the derivation is impossible without an unjustified extra assumption of conditional independence. Indeed, $P(f|p \cap b)$ and $P(\bar{f}|p \cap b)$ are given by

$$\begin{aligned}
P(f|p \cap b) &= \frac{P(f, p, b)}{P(p, b)} = \frac{P(p, b|f) P(f)}{P(b|p) P(p)} \\
P(\bar{f}|p \cap b) &= \frac{P(\bar{f}, p, b)}{P(p, b)} = \frac{P(p, b|\bar{f}) P(\bar{f})}{P(b|p) P(p)}
\end{aligned} \tag{15}$$

If one assumes, as J. Pearl does, that the conditional independence condition also holds, i.e. $P(p, b|f) = P(p|f) P(b|f)$ and $P(p, b|\bar{f}) = P(p|\bar{f}) P(b|\bar{f})$, then one gets


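$$\begin{aligned}
P(f|p \cap b) &= \frac{P(p|f) P(b|f) P(f)}{P(b|p) P(p)} \\
P(\bar{f}|p \cap b) &= \frac{P(p|\bar{f}) P(b|\bar{f}) P(\bar{f})}{P(b|p) P(p)}
\end{aligned}$$

By accepting again the principle of indifference, $P(f) = P(\bar{f}) = 1/2$ and $P(p) = P(\bar{p}) = 1/2$, one gets the following expressions

$$\begin{aligned}
\hat{P}(f|p \cap b) &= \frac{P(p|f) P(b|f)}{P(b|p)} \\
\hat{P}(\bar{f}|p \cap b) &= \frac{P(p|\bar{f}) P(b|\bar{f})}{P(b|p)}
\end{aligned} \tag{16}$$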

Replacing the probabilities $P(p|f)$, $P(b|f)$, $P(b|p)$, $P(p|\bar{f})$ and $P(b|\bar{f})$ by their values in formula (16), one finally gets

$$\begin{aligned}
\hat{P}(f|p \cap b) &= \frac{\epsilon_1 (1 - \epsilon_2)}{1 - \epsilon_3} \\
\hat{P}(\bar{f}|p \cap b) &= \frac{(1 - \epsilon_1) \epsilon_2}{1 - \epsilon_3}
\end{aligned} \tag{17}$$

Therefore we see that, even if one accepts the principle of indifference together with the conditional independence assumption, the approximated "probabilities" both remain small and do not correspond to a real probability measure, since the conditional probabilities of the exclusive and exhaustive elements $f$ and $\bar{f}$ do not add up to one. When $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ tend towards 0, one has

$$\hat{P}(f|p \cap b) + \hat{P}(\bar{f}|p \cap b) \approx 0$$
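
The failure of normalization in (17) is easy to observe numerically. The sketch below simply evaluates the two expressions of (17); the function name `estimates_17` and the sample epsilons are illustrative assumptions.

```python
def estimates_17(eps1: float, eps2: float, eps3: float):
    """Evaluate the two approximated 'probabilities' of formula (17)."""
    if not 0.0 <= eps3 < 1.0:
        raise ValueError("eps3 must lie in [0, 1)")
    p_fly = eps1 * (1.0 - eps2) / (1.0 - eps3)
    p_not_fly = (1.0 - eps1) * eps2 / (1.0 - eps3)
    return p_fly, p_not_fly


# Hypothetical small doubts on the three rules.
p_fly, p_not_fly = estimates_17(0.01, 0.01, 0.01)
total = p_fly + p_not_fly  # far from 1 for small epsilons
```

For these values both estimates are about 0.01, so their sum is about 0.02 instead of 1, in line with the conclusion that no genuine probability measure is obtained.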


Actually, our analysis based on the principle of indifference, the conditional independence assumption and the model proposed by Judea Pearl clearly proves that the Bayesian reasoning cannot be applied rigorously to this kind of weighted rule-based system, because no probabilistic model exists for correctly describing the problem. This conclusion is actually not surprising, taking into account Lewis' theorem explained in detail in [8] (chapter 11).


Let's now explain the reason for the error in the fallacious reasoning, which looked coherent with common intuition. The problem arises directly from the fact that the penguin class and the bird class are defined in this problem only with respect to the "flying" and "not-flying" properties. If one considers only these properties, then no Tweety animal can be categorically classified as a penguin-bird, because penguin-birdness doesn't hold in reality based on these exclusive and exhaustive properties (if we consider only the information given within the rules $r_1$, $r_2$ and $r_3$). Actually, everybody knows that penguins are effectively classified as birds because the "birdness" property is not defined with respect to the "flying" or "not-flying" abilities of the animal but by other zoological characteristics $C$ (birds are oviparous vertebrate animals with warm blood, a beak, feathers, and anterior members that are wings), and such information must be properly taken into account in the rule-based system to avoid falling into the trap of such fallacious reasoning. The intuition (which seems to justify the conclusion of the fallacious reasoning) for TP2 is actually biased, because one already knows that penguins (which are truly classified as birds by some other criteria) do not fly in the real world, and thus we commit a low conviction (which is definitely not a probability measure, but rather a belief) to the fact that a penguin-bird can fly. Thus Pearl's analysis proposed in [12] appears to the authors to be unfortunately incomplete and somewhat fallacious.


4 The Dempster-Shafer reasoning 


As pointed out by Judea Pearl in [12], the Dempster-Shafer reasoning yields, for this problem, a very counter-intuitive result: birdness seems to endow Tweety with extra flying power! We present here our analysis of this problem based on the Dempster-Shafer reasoning.


Let's examine in detail the available prior information summarized by the rule $r_1$: "Penguins normally don't fly" $\Leftrightarrow (p \rightarrow \neg f)$ with the conviction $w_1 = 1 - \epsilon_1$, where $\epsilon_1$ is a small positive number close to zero. This information, in the DST framework, has to be correctly represented in terms of a conditional belief $Bel_1(\bar{f}|p) = 1 - \epsilon_1$ rather than directly by the mass $m_1(\bar{f} \cap p) = 1 - \epsilon_1$.


Choosing $Bel_1(\bar{f}|p) = 1 - \epsilon_1$ means that there is a high degree of belief that a penguin-animal is also a nonflying-animal (whatever kind of animal we are observing). This representation perfectly reflects our prior knowledge, while the erroneous coarse modeling based on the commitment $m_1(\bar{f} \cap p) = 1 - \epsilon_1$ is unable to distinguish between rule $r_1$ and another (possibly erroneous) rule like $r_1' : (\neg f \rightarrow p)$ having the same conviction value $w_1$. The correct model allows us to distinguish between $r_1$ and $r_1'$ (even if they have the same numerical level of conviction) by considering the two different conditional beliefs $Bel_1(\bar{f}|p) = 1 - \epsilon_1$ and $Bel_{1'}(p|\bar{f}) = 1 - \epsilon_1$. The coarse/inadequate basic belief assignment modeling (if adopted) would, on the contrary, make no distinction between the two rules $r_1$ and $r_1'$, since one would have to take $m_1(\bar{f} \cap p) = m_{1'}(p \cap \bar{f})$, and therefore cannot serve as the starting model for the analysis.


Similarly, the prior information relative to rules $r_2 : (b \rightarrow f)$ and $r_3 : (p \rightarrow b)$ with convictions $w_2 = 1 - \epsilon_2$ and $w_3 = 1 - \epsilon_3$ has to be modeled by the conditional beliefs $Bel_2(f|b) = 1 - \epsilon_2$ and $Bel_3(b|p) = 1 - \epsilon_3$ respectively.


The first problem we now have to face is the combination of these three pieces of prior information characterized by $Bel_1(\bar{f}|p) = 1 - \epsilon_1$, $Bel_2(f|b) = 1 - \epsilon_2$ and $Bel_3(b|p) = 1 - \epsilon_3$. All the available prior information can actually be viewed as three independent bodies of evidence $\mathcal{B}_1$, $\mathcal{B}_2$ and $\mathcal{B}_3$ separately providing the partial knowledge summarized through the values of $Bel_1(\bar{f}|p)$, $Bel_2(f|b)$ and $Bel_3(b|p)$. To achieve the combination, one needs to define complete basic belief assignments $m_1(.)$, $m_2(.)$ and $m_3(.)$ compatible with the partial conditional beliefs $Bel_1(\bar{f}|p) = 1 - \epsilon_1$, $Bel_2(f|b) = 1 - \epsilon_2$ and $Bel_3(b|p) = 1 - \epsilon_3$ without introducing extra knowledge: we don't want to introduce into the derivations some extra information we don't have in reality. We present in detail the justification for the choice of the assignment $m_1(.)$. The choice for $m_2(.)$ and $m_3(.)$ follows similarly.


The body of evidence $\mathcal{B}_1$ provides some information only about $f$ and $p$ through the value of $Bel_1(\bar{f}|p)$ and without reference to $b$. Therefore the frame of discernment $\Theta_1$ induced by $\mathcal{B}_1$ and satisfying Shafer's model (i.e. a finite set of exhaustive and exclusive elements) corresponds to

$$\Theta_1 = \{\theta_1 \triangleq \bar{f} \cap \bar{p},\ \theta_2 \triangleq f \cap \bar{p},\ \theta_3 \triangleq \bar{f} \cap p,\ \theta_4 \triangleq f \cap p\}$$


schematically represented by 





The complete basic belief assignment $m_1(.)$ we are searching for, defined over the power set $2^{\Theta_1}$, must be compatible with $Bel_1(\bar{f}|p)$; it is actually the result of Dempster's combination of an unknown (for now) basic belief assignment $m_1'(.)$ with the particular assignment $m_1''(.)$ defined by $m_1''(p \triangleq \theta_3 \cup \theta_4) = 1$; in other words, one has

$$m_1(.) = [m_1' \oplus m_1''](.)$$


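From now on, we explicitly introduce the conditioning term in our notation to avoid confusion, and thus we use $m_1(.|p) = m_1(.|\theta_3 \cup \theta_4)$ instead of $m_1(.)$. From $m_1''(p \triangleq \theta_3 \cup \theta_4) = 1$ and from any generic unknown basic belief assignment $m_1'(.)$ defined by its components $m_1'(\emptyset) \triangleq 0$, $m_1'(\theta_1)$, $m_1'(\theta_2)$, $m_1'(\theta_3)$, $m_1'(\theta_4)$, $m_1'(\theta_1 \cup \theta_2)$, $m_1'(\theta_1 \cup \theta_3)$, $m_1'(\theta_1 \cup \theta_4)$, $m_1'(\theta_2 \cup \theta_3)$, $m_1'(\theta_2 \cup \theta_4)$, $m_1'(\theta_3 \cup \theta_4)$, $m_1'(\theta_1 \cup \theta_2 \cup \theta_3)$, $m_1'(\theta_1 \cup \theta_2 \cup \theta_4)$, $m_1'(\theta_1 \cup \theta_3 \cup \theta_4)$, $m_1'(\theta_2 \cup \theta_3 \cup \theta_4)$, $m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4)$, and applying Dempster's rule, one easily gets the following expressions for $m_1(.|\theta_3 \cup \theta_4)$. All $m_1(.|\theta_3 \cup \theta_4)$ masses are zero except theoretically

$$\begin{aligned}
m_1(\theta_3|\theta_3 \cup \theta_4) &= m_1''(\theta_3 \cup \theta_4) [m_1'(\theta_3) + m_1'(\theta_1 \cup \theta_3) + m_1'(\theta_2 \cup \theta_3) + m_1'(\theta_1 \cup \theta_2 \cup \theta_3)] / K_1 \\
m_1(\theta_4|\theta_3 \cup \theta_4) &= m_1''(\theta_3 \cup \theta_4) [m_1'(\theta_4) + m_1'(\theta_1 \cup \theta_4) + m_1'(\theta_2 \cup \theta_4) + m_1'(\theta_1 \cup \theta_2 \cup \theta_4)] / K_1 \\
m_1(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) &= m_1''(\theta_3 \cup \theta_4) [m_1'(\theta_3 \cup \theta_4) + m_1'(\theta_1 \cup \theta_3 \cup \theta_4) + m_1'(\theta_2 \cup \theta_3 \cup \theta_4) + m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4)] / K_1
\end{aligned}$$

with

$$K_1 \triangleq 1 - m_1''(\theta_3 \cup \theta_4) [m_1'(\theta_1) + m_1'(\theta_2) + m_1'(\theta_1 \cup \theta_2)]$$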
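To complete the derivation of $m_1(.|\theta_3 \cup \theta_4)$, one needs to use the fact that $Bel_1(\bar{f}|p) = 1 - \epsilon_1$ which, by definition, is expressed by

$$\begin{aligned}
Bel_1(\bar{f}|p) &= Bel_1(\theta_1 \cup \theta_3|\theta_3 \cup \theta_4) \\
&= m_1(\theta_1|\theta_3 \cup \theta_4) + m_1(\theta_3|\theta_3 \cup \theta_4) + m_1(\theta_1 \cup \theta_3|\theta_3 \cup \theta_4) \\
&= 1 - \epsilon_1
\end{aligned}$$

But from the generic expression of $m_1(.|\theta_3 \cup \theta_4)$, one also knows that $m_1(\theta_1|\theta_3 \cup \theta_4) = 0$ and $m_1(\theta_1 \cup \theta_3|\theta_3 \cup \theta_4) = 0$. Thus the knowledge of $Bel_1(\bar{f}|p) = 1 - \epsilon_1$ implies

$$m_1(\theta_3|\theta_3 \cup \theta_4) = [m_1'(\theta_3) + m_1'(\theta_1 \cup \theta_3) + m_1'(\theta_2 \cup \theta_3) + m_1'(\theta_1 \cup \theta_2 \cup \theta_3)] / K_1 = 1 - \epsilon_1$$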





This is however not sufficient to fully define the values of all components of $m_1(.|\theta_3 \cup \theta_4)$ or, equivalently, of all components of $m_1'(.)$. To complete the derivation without extra unjustified specific information, one needs to apply the minimal commitment principle (MCP), which states that one should never give more support to the truth of a proposition than justified [9]. According to this principle, we commit a non-null value only to the least specific proposition involved in the expression of $m_1(\theta_3|\theta_3 \cup \theta_4)$. In other words, the MCP allows us to legitimately choose

$$\begin{aligned}
& m_1'(\theta_1) = m_1'(\theta_2) = m_1'(\theta_3) = 0 \\
& m_1'(\theta_1 \cup \theta_2) = m_1'(\theta_1 \cup \theta_3) = m_1'(\theta_2 \cup \theta_3) = 0 \\
& m_1'(\theta_1 \cup \theta_2 \cup \theta_3) \neq 0
\end{aligned}$$

Thus $K_1 = 1$ and $m_1(\theta_3|\theta_3 \cup \theta_4)$ reduces to

$$m_1(\theta_3|\theta_3 \cup \theta_4) = m_1'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_1$$


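Since the sum of basic belief assignments must be one, one must also have for the remaining (uncommitted for now) masses of $m_1'(.)$ the constraint

$$\begin{aligned}
& m_1'(\theta_4) + m_1'(\theta_1 \cup \theta_4) + m_1'(\theta_2 \cup \theta_4) + m_1'(\theta_1 \cup \theta_2 \cup \theta_4) \\
& + m_1'(\theta_3 \cup \theta_4) + m_1'(\theta_1 \cup \theta_3 \cup \theta_4) + m_1'(\theta_2 \cup \theta_3 \cup \theta_4) \\
& + m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_1
\end{aligned}$$

By applying the MCP a second time, one chooses $m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_1$.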


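Finally, the complete and least specific belief assignment $m_1(.|p)$ compatible with the available prior information $Bel_1(\bar{f}|p) = 1 - \epsilon_1$ provided by the source $\mathcal{B}_1$ reduces to

$$m_1(\theta_3|\theta_3 \cup \theta_4) = m_1'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_1 \tag{18}$$

$$m_1(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_1 \tag{19}$$

or equivalently

$$m_1(\bar{f} \cap p|p) = m_1'(\bar{p} \cup \bar{f}) = 1 - \epsilon_1 \tag{20}$$

$$m_1(p|p) = m_1'(\bar{p} \cup \bar{f} \cup p \cup f) = \epsilon_1 \tag{21}$$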


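It is easy to check, from the mass $m_1(.|p)$, that one effectively gets $Bel_1(\bar{f}|p) = 1 - \epsilon_1$. Indeed:

$$\begin{aligned}
Bel_1(\bar{f}|p) &= Bel_1(\theta_1 \cup \theta_3|p) \\
&= Bel_1((\bar{f} \cap \bar{p}) \cup (\bar{f} \cap p)|p) \\
&= \underbrace{m_1(\bar{f} \cap \bar{p}|p)}_{0} + m_1(\bar{f} \cap p|p) + \underbrace{m_1((\bar{f} \cap \bar{p}) \cup (\bar{f} \cap p)|p)}_{0} \\
&= m_1(\bar{f} \cap p|p) \\
&= 1 - \epsilon_1
\end{aligned}$$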


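In a similar way, for the source $\mathcal{B}_2$ with $\Theta_2$ defined as

$$\Theta_2 = \{\theta_1 \triangleq f \cap \bar{b},\ \theta_2 \triangleq \bar{b} \cap \bar{f},\ \theta_3 \triangleq f \cap b,\ \theta_4 \triangleq \bar{f} \cap b\}$$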


schematically represented by 





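one looks for $m_2(.|b) = [m_2' \oplus m_2''](.)$ with $m_2''(b) = m_2''(\theta_3 \cup \theta_4) = 1$. From the MCP, the condition $Bel_2(f|b) = 1 - \epsilon_2$ and simple algebraic manipulations, one finally gets

$$m_2(\theta_3|\theta_3 \cup \theta_4) = m_2'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_2 \tag{22}$$

$$m_2(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = m_2'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_2 \tag{23}$$

or equivalently

$$m_2(f \cap b|b) = m_2'(\bar{b} \cup f) = 1 - \epsilon_2 \tag{24}$$

$$m_2(b|b) = m_2'(\bar{b} \cup f \cup b \cup \bar{f}) = \epsilon_2 \tag{25}$$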


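In a similar way, for the source $\mathcal{B}_3$ with $\Theta_3$ defined as

$$\Theta_3 = \{\theta_1 \triangleq b \cap \bar{p},\ \theta_2 \triangleq \bar{b} \cap \bar{p},\ \theta_3 \triangleq p \cap b,\ \theta_4 \triangleq \bar{b} \cap p\}$$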


schematically represented by 





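one looks for $m_3(.|p) = [m_3' \oplus m_3''](.)$ with $m_3''(p) = m_3''(\theta_3 \cup \theta_4) = 1$. From the MCP, the condition $Bel_3(b|p) = 1 - \epsilon_3$ and simple algebraic manipulations, one finally gets

$$m_3(\theta_3|\theta_3 \cup \theta_4) = m_3'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_3 \tag{26}$$

$$m_3(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = m_3'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_3 \tag{27}$$

or equivalently

$$m_3(b \cap p|p) = m_3'(\bar{p} \cup b) = 1 - \epsilon_3 \tag{28}$$

$$m_3(p|p) = m_3'(\bar{b} \cup \bar{p} \cup b \cup p) = \epsilon_3 \tag{29}$$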


Since all the complete prior basic belief assignments are available, one can combine them with Dempster's rule to summarize all our prior knowledge drawn from our simple rule-based expert system characterized by the rules $R = \{r_1, r_2, r_3\}$ and the convictions/confidences $W = \{w_1, w_2, w_3\}$ in these rules.


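The fusion operation first requires choosing the following frame of discernment $\Theta$ (satisfying Shafer's model) given by

$$\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4, \theta_5, \theta_6, \theta_7, \theta_8\}$$

where

$$\begin{aligned}
\theta_1 &\triangleq f \cap b \cap p & \qquad \theta_5 &\triangleq \bar{f} \cap b \cap p \\
\theta_2 &\triangleq f \cap b \cap \bar{p} & \qquad \theta_6 &\triangleq \bar{f} \cap b \cap \bar{p} \\
\theta_3 &\triangleq f \cap \bar{b} \cap p & \qquad \theta_7 &\triangleq \bar{f} \cap \bar{b} \cap p \\
\theta_4 &\triangleq f \cap \bar{b} \cap \bar{p} & \qquad \theta_8 &\triangleq \bar{f} \cap \bar{b} \cap \bar{p}
\end{aligned}$$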
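The fusion of the masses $m_1(.)$ given by eqs. (20)-(21) with $m_2(.)$ given by eqs. (24)-(25) using Dempster's rule of combination yields $m_{12}(.) = [m_1 \oplus m_2](.)$ with the following non-null components

$$\begin{aligned}
m_{12}(f \cap b \cap p) &= \epsilon_1 (1 - \epsilon_2) / K_{12} \\
m_{12}(\bar{f} \cap b \cap p) &= \epsilon_2 (1 - \epsilon_1) / K_{12} \\
m_{12}(b \cap p) &= \epsilon_1 \epsilon_2 / K_{12}
\end{aligned}$$

with $K_{12} \triangleq 1 - (1 - \epsilon_1)(1 - \epsilon_2) = \epsilon_1 + \epsilon_2 - \epsilon_1 \epsilon_2$.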


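The fusion of all prior knowledge by Dempster's rule, $m_{123}(.) = [m_1 \oplus m_2 \oplus m_3](.) = [m_{12} \oplus m_3](.)$, yields the final result:

$$\begin{aligned}
m_{123}(f \cap b \cap p) &= m_{123}(\theta_1) = \epsilon_1 (1 - \epsilon_2) / K_{123} \\
m_{123}(\bar{f} \cap b \cap p) &= m_{123}(\theta_5) = \epsilon_2 (1 - \epsilon_1) / K_{123} \\
m_{123}(b \cap p) &= m_{123}(\theta_1 \cup \theta_5) = \epsilon_1 \epsilon_2 / K_{123}
\end{aligned}$$

with $K_{123} = K_{12} \triangleq 1 - (1 - \epsilon_1)(1 - \epsilon_2) = \epsilon_1 + \epsilon_2 - \epsilon_1 \epsilon_2$,

which defines actually and precisely the conditional belief assignment $m_{123}(.|p \cap b)$. It turns out that the fusion with the last basic belief assignment $m_3(.)$ brings no change with respect to the previous fusion result $m_{12}(.)$ in this particular problem.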
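
The two fusion steps above can be reproduced numerically. The following Python sketch is a minimal illustration, assuming propositions are encoded as sets of truth assignments `(f, b, p)`; the helper names (`event`, `dempster`, `prior_masses`) and the sample epsilons are our own assumptions, not part of the original derivation.

```python
from itertools import product

# Atoms of the frame are truth assignments (f, b, p); a proposition is a
# frozenset of atoms, so set intersection plays the role of conjunction.
ATOMS = frozenset(product([True, False], repeat=3))


def event(predicate):
    """Set of atoms (f, b, p) on which `predicate` is true."""
    return frozenset(atom for atom in ATOMS if predicate(*atom))


def dempster(m_a, m_b):
    """Dempster's rule: conjunctive combination followed by normalization."""
    combined, conflict = {}, 0.0
    for x, mass_x in m_a.items():
        for y, mass_y in m_b.items():
            common = x & y
            if common:
                combined[common] = combined.get(common, 0.0) + mass_x * mass_y
            else:
                conflict += mass_x * mass_y
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {z: mass / (1.0 - conflict) for z, mass in combined.items()}


PENGUIN = event(lambda f, b, p: p)
BIRD = event(lambda f, b, p: b)


def prior_masses(eps1, eps2, eps3):
    """Masses (20)-(21), (24)-(25) and (28)-(29) expressed on the common frame."""
    m1 = {event(lambda f, b, p: (not f) and p): 1.0 - eps1, PENGUIN: eps1}
    m2 = {event(lambda f, b, p: f and b): 1.0 - eps2, BIRD: eps2}
    m3 = {event(lambda f, b, p: b and p): 1.0 - eps3, PENGUIN: eps3}
    return m1, m2, m3


eps1, eps2, eps3 = 0.01, 0.02, 0.05  # hypothetical doubts on the rules
m1, m2, m3 = prior_masses(eps1, eps2, eps3)
m12 = dempster(m1, m2)
m123 = dempster(m12, m3)
```

In this sketch `m12` has exactly three focal elements, and combining it with `m3` leaves every mass unchanged, since all focal elements of `m12` are already included in both focal elements of `m3`.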


Since we are actually interested in assessing the belief that our observed particular penguin-animal named Tweety (denoted as $T = (p \cap b)$) can fly, we need to combine all our prior knowledge $m_{123}(.)$ drawn from our rule-based system with the belief assignment $m_o(T = (p \cap b)) = 1$ characterizing the observation about Tweety. Applying Dempster's rule again, one finally gets the resulting conditional basic belief function $m_{o123} = [m_o \oplus m_{123}](.)$ defined by

$$\begin{aligned}
m_{o123}(T = (f \cap b \cap p)|T = (p \cap b)) &= \epsilon_1 (1 - \epsilon_2) / K_{12} \\
m_{o123}(T = (\bar{f} \cap b \cap p)|T = (p \cap b)) &= \epsilon_2 (1 - \epsilon_1) / K_{12} \\
m_{o123}(T = (b \cap p)|T = (p \cap b)) &= \epsilon_1 \epsilon_2 / K_{12}
\end{aligned}$$


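From the Dempster-Shafer reasoning, the belief and plausibility that Tweety can fly are given by

$$Bel(T = f|T = (p \cap b)) = \sum_{x \in 2^{\Theta},\ x \subseteq f} m_{o123}(T = x|T = (p \cap b))$$

$$Pl(T = f|T = (p \cap b)) = \sum_{x \in 2^{\Theta},\ x \cap f \neq \emptyset} m_{o123}(T = x|T = (p \cap b))$$

Because $f = [(f \cap b \cap p) \cup (f \cap b \cap \bar{p}) \cup (f \cap \bar{b} \cap p) \cup (f \cap \bar{b} \cap \bar{p})]$ and because of the specific values of the masses defining $m_{o123}(.)$, one has

$$Bel(T = f|T = (p \cap b)) = m_{o123}(T = (f \cap b \cap p)|T = (p \cap b))$$

$$Pl(T = f|T = (p \cap b)) = m_{o123}(T = (f \cap b \cap p)|T = (p \cap b)) + m_{o123}(T = (b \cap p)|T = (p \cap b))$$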








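and finally

$$Bel(T = f|T = (p \cap b)) = \frac{\epsilon_1 (1 - \epsilon_2)}{K_{12}} \tag{30}$$

$$Pl(T = f|T = (p \cap b)) = \frac{\epsilon_1 (1 - \epsilon_2)}{K_{12}} + \frac{\epsilon_1 \epsilon_2}{K_{12}} = \frac{\epsilon_1}{K_{12}} \tag{31}$$

In a similar way, one gets for the belief and the plausibility that Tweety cannot fly

$$Bel(T = \bar{f}|T = (p \cap b)) = \frac{\epsilon_2 (1 - \epsilon_1)}{K_{12}} \tag{32}$$

$$Pl(T = \bar{f}|T = (p \cap b)) = \frac{\epsilon_2 (1 - \epsilon_1)}{K_{12}} + \frac{\epsilon_1 \epsilon_2}{K_{12}} = \frac{\epsilon_2}{K_{12}} \tag{33}$$

Using the first-order approximation when $\epsilon_1$ and $\epsilon_2$ are very small positive numbers, one finally gets

$$Bel(T = f|T = (p \cap b)) \approx Pl(T = f|T = (p \cap b)) \approx \frac{\epsilon_1}{\epsilon_1 + \epsilon_2}$$

In a similar way, one gets for the belief that Tweety cannot fly

$$Bel(T = \bar{f}|T = (p \cap b)) \approx Pl(T = \bar{f}|T = (p \cap b)) \approx \frac{\epsilon_2}{\epsilon_1 + \epsilon_2}$$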
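
The closed-form results (30)-(33) and their first-order approximations are easy to evaluate. The sketch below is a plain numerical illustration; the function name `ds_tweety` and the chosen epsilons are assumptions made for the example.

```python
def ds_tweety(eps1: float, eps2: float) -> dict:
    """Belief and plausibility of eqs. (30)-(33) for Tweety flying or not flying."""
    k12 = eps1 + eps2 - eps1 * eps2
    if k12 <= 0.0:
        raise ValueError("at least one epsilon must be strictly positive")
    return {
        "bel_fly": eps1 * (1.0 - eps2) / k12,      # eq. (30)
        "pl_fly": eps1 / k12,                      # eq. (31)
        "bel_not_fly": eps2 * (1.0 - eps1) / k12,  # eq. (32)
        "pl_not_fly": eps2 / k12,                  # eq. (33)
    }


# Hypothetical case where non-flying birds are much rarer than flying penguins.
result = ds_tweety(eps1=0.01, eps2=0.0001)
approx_fly = 0.01 / (0.01 + 0.0001)  # first-order approximation of Bel and Pl
```

With these values the belief that Tweety can fly is about 0.99, which is the counter-intuitive behaviour discussed next: the rarer the non-flying birds, the stronger the belief that a penguin-bird flies.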


This result coincides with Judea Pearl's result, but a different analysis and a detailed presentation have been given here. It turns out that this simple and complete analysis actually corresponds to the ballooning extension and the generalized Bayesian theorem proposed by Smets in and discussed by Shafer in although it was carried out independently of Smets' works. As pointed out by Judea Pearl, this result based on DST and Dempster's rule of combination looks very paradoxical/counter-intuitive, since it means that if nonflying birds are very rare, i.e. $\epsilon_2 \approx 0$, then penguin-birds like our observed penguin-bird Tweety have a very big chance of flying. As stated by Judea Pearl on pages 448-449:
"The clash with intuition revolves not around the exact numerical value of $Bel(f)$ but rather around the unacceptable phenomenon that rule $r_3$, stating that penguins are a subclass of birds, plays no role in the analysis. Knowing that Tweety is both a penguin and a bird renders $Bel(T = f|T = (p \cap b))$ solely a function of $m_1(.)$ and $m_2(.)$, regardless of how penguins and birds are related. This stands contrary to common discourse, where people expect class properties to be overridden by properties of more specific subclasses. While in classical logic the three rules in our example would yield an unforgivable contradiction, the uncertainties attached to these rules, together with Dempster's normalization, now render them manageable. However, they are managed in the wrong way whenever we interpret if-then rules as randomized logical formulas of the material-implication type, instead of statements of conditional probabilities". Keep in mind, however, that this statement by Pearl is given to show the semantic clash between the Dempster-Shafer reasoning and the fallacious Bayesian reasoning, in order to support the Bayesian reasoning approach.


5 The Dezert-Smarandache reasoning 


Before going further in our analysis, some clarification is necessary to explain to the reader the fundamental difference between the foundations of DSmT and DST. DSmT can easily be viewed as a general and flexible bottom-up approach for managing uncertainty and conflicts in fusion problems. It arises from the fact that the conflict between sources of evidence can come not only from the reliability of the sources themselves (which can be handled quite easily by classical discounting methods) but also from different interpretations of the elements of the frame, just because the sources of evidence have only limited knowledge and provide their beliefs only with respect to that knowledge, usually based on their own (local) experience. Moreover, the elements of the frame of the problem may truly not be refinable at all in some cases involving vague concepts like smallness/tallness, pleasure/pain, etc., because of the continuous path from one to the other. Based on this matter of fact, DSmT proposes a new mathematical framework which starts at the bottom level (solid ground level) from the free DSm model and the notion of hyper-power set (Dedekind's lattice), then provides a general rule of combination to work with the free DSm model. It then includes the possibility of taking into account any kind of integrity constraint in the free DSm model, if necessary, through the hybrid DSm rule of combination. Taking an integrity constraint into account consists just in forcing some elements of the Dedekind's lattice to be empty, just because they truly are for some given problems.


The introduction of an integrity constraint is like "pushing an elevator button" for going a bit higher in the process of
managing uncertainty and conflicts. If one needs to go higher, then one can take several integrity constraints into account
as well in the framework of DSmT. If we finally want to take into account all possible exclusivity constraints, because we know
that all elements of the frame of the problem under consideration are truly exclusive, then we go directly to the top
level (Shafer's model, which serves as the foundation for DST).


DSmT, however, can handle not only exclusivity constraints but also existential constraints or mixed constraints,
which is helpful for some dynamic fusion problems. It is also important to emphasize that the hybrid DSm rule of
combination is definitely not equivalent to Dempster's rule of combination (and its alternatives based on the top level),
because one can stop and work at any level in the process of managing uncertainty and conflicts, depending on the nature
of the problem. The hybrid DSm rule and Dempster's rule do not provide the same results even when working with Shafer's
model, as will be proved in the sequel. The approach proposed by DSmT to attack the fusion problem is totally new
both in its foundations and in the solution provided.


DSmT was originally developed (at the ground level) for the fusion of uncertain and paradoxical (highly conflicting)
sources of information (bodies of evidence) based on the free DSm model $\mathcal{M}^f(\Theta)$, which assumes that none of
the elements of the frame $\Theta$ are exclusive. This model is opposite to Shafer's model. Considering a free DSm model
$\mathcal{M}^f(\Theta)$ with $\Theta = \{\theta_1, \ldots, \theta_n\}$, DSmT starts with the notion of hyper-power set $D^\Theta$, defined as the set of all
composite propositions built from elements of $\Theta$ with the $\cup$ and $\cap$ operators ($\Theta$ generates $D^\Theta$ under $\cup$ and $\cap$) such that


1. $\emptyset, \theta_1, \ldots, \theta_n \in D^\Theta$.
2. If $A, B \in D^\Theta$, then $A \cap B \in D^\Theta$ and $A \cup B \in D^\Theta$.
3. No other elements belong to $D^\Theta$, except those obtained by using rules 1 or 2.


The cardinality of the hyper-power set, $d(n) \triangleq |D^\Theta|$ for $n \geq 1$, follows the sequence of Dedekind's numbers 1, 2, 5, 19, 167,
7580, 7828353, ... (the leading 1 corresponds to $n = 0$). More details about the generation and partial ordering of elements of the hyper-power set can be found in
[4] [5] [21]. From this model, the authors have proposed a new simple associative and commutative rule of combination (the
DSm classic rule) and then extended this rule to deal with any kind of hybrid model, i.e. sets $\Theta$ for which some
propositions/elements of $D^\Theta$ are known or forced to be empty depending on the nature and the dynamicity of the fusion problem
under consideration. In this framework, Shafer's model appears only as a special hybrid model (the most constrained
one, if we don't introduce existential constraints). The hybrid DSm fusion rule covers a wide class of fusion applications
but is restricted to the fusion of precise uncertain and paradoxical information only [21]. We have recently extended this rule
with new set operators for the fusion of imprecise, uncertain and paradoxical information - see for details.
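
The three generation rules and the first cardinalities can be checked with a short program. The following is only a minimal sketch under an assumed encoding, not the construction used in the cited references: each proposition is represented by the set of Venn-diagram regions it covers, and the hypothetical helper `hyper_power_set` closes the singletons under union and intersection.

```python
from itertools import combinations


def hyper_power_set(n):
    """Build D^Theta for a free DSm model with n elements (illustrative only).

    Assumed encoding: a Venn region is a non-empty subset of element indices,
    and a proposition is the frozenset of regions it covers.
    """
    regions = [frozenset(c)
               for r in range(1, n + 1)
               for c in combinations(range(n), r)]
    # Rule 1: the empty set and every singleton theta_i belong to D^Theta.
    singletons = [frozenset(a for a in regions if i in a) for i in range(n)]
    elements = {frozenset()} | set(singletons)
    frontier = set(singletons)
    # Rule 2: close under intersection and union until nothing new appears.
    while frontier:
        new = set()
        for x in frontier:
            for y in list(elements):
                for z in (x & y, x | y):
                    if z not in elements:
                        new.add(z)
        elements |= new
        frontier = new
    # Rule 3: nothing else is added.
    return elements


sizes = [len(hyper_power_set(n)) for n in range(5)]
print(sizes)
```

The exhaustive closure is practical only for very small frames, since the cardinality grows as fast as the sequence quoted above.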


We analyze here the Tweety penguin triangle problem with DSmT. The prior knowledge characterized by the rules
$R = \{r_1, r_2, r_3\}$ and convictions $W = \{w_1, w_2, w_3\}$ is modeled as three independent sources of evidence defined on
separate minimal and potentially paradoxical (i.e. internally conflicting) frames $\Theta_1 \triangleq \{p, \bar{f}\}$, $\Theta_2 \triangleq \{b, f\}$ and $\Theta_3 \triangleq \{p, b\}$,
since the rule $r_1$ doesn't refer to the existence of $b$, the rule $r_2$ doesn't refer to the existence of $p$ and the rule $r_3$ doesn't
refer to the existence of $f$ or $\bar{f}$. Note that DSmT doesn't require the refinement of frames as DST does (see
previous section). We follow the same analysis as in the previous section, but now based on our DSm reasoning and the DSm
rule of combination.


The first source $\mathcal{B}_1$ relative to $r_1$ with confidence $w_1 = 1 - \epsilon_1$ provides us the conditional belief $\mathrm{Bel}_1(\bar{f}|p)$, which
is now defined from a paradoxical basic belief assignment $m_1(.)$ resulting from the DSm combination of $m_1'(p) = 1$
with $m_1''(.)$ defined on the hyper-power set $D^{\Theta_1} = \{\emptyset, p, \bar{f}, p \cap \bar{f}, p \cup \bar{f}\}$. The choice for $m_1''(.)$ results directly from the
derivation of the DSm rule and the application of the MCP. Indeed, the non-null components of $m_1(.)$ are given by (we
introduce explicitly the conditioning term in the notation for convenience):


$$m_1(p|p) = \underbrace{m_1'(p)}_{1}\, m_1''(p) + \underbrace{m_1'(p)}_{1}\, m_1''(p \cup \bar{f})$$

$$m_1(p \cap \bar{f}|p) = \underbrace{m_1'(p)}_{1}\, m_1''(\bar{f}) + \underbrace{m_1'(p)}_{1}\, m_1''(p \cap \bar{f})$$


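The information $\mathrm{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ implies


$$\mathrm{Bel}_1(\bar{f}|p) = m_1(\bar{f}|p) + m_1(p \cap \bar{f}|p) = 1 - \epsilon_1$$


Since $m_1(p|p) + m_1(p \cap \bar{f}|p) = 1$, one has necessarily $m_1(\bar{f}|p) = 0$ and thus, from the previous equation, $m_1(\bar{f} \cap p|p) =
1 - \epsilon_1$, which implies both


$$m_1(p|p) = \epsilon_1$$

$$m_1(p \cap \bar{f}|p) = \underbrace{m_1'(p)}_{1}\, m_1''(\bar{f}) + \underbrace{m_1'(p)}_{1}\, m_1''(p \cap \bar{f}) = m_1''(\bar{f}) + m_1''(p \cap \bar{f}) = 1 - \epsilon_1$$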


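Applying the MCP, it results that one must choose
$$m_1''(\bar{f}) = 1 - \epsilon_1 \quad \text{and} \quad m_1''(p \cap \bar{f}) = 0$$
The sum of the remaining masses of $m_1''(.)$ must then be equal to $\epsilon_1$, i.e.
$$m_1''(p) + m_1''(p \cup \bar{f}) = \epsilon_1$$
Applying the MCP again on this last constraint, one gets naturally
$$m_1''(p) = 0 \quad \text{and} \quad m_1''(p \cup \bar{f}) = \epsilon_1$$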


Finally the belief assignment $m_1(.|p)$ relative to the source $\mathcal{B}_1$ and compatible with the constraint $\mathrm{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$
holds the same numerical values as within the DST analysis (see eqs. (20)-(21)) and is given by


$$m_1(p \cap \bar{f}|p) = 1 - \epsilon_1$$
$$m_1(p|p) = \epsilon_1$$


but results here from the DSm combination of the two following assignments (i.e. $m_1(.) = [m_1' \oplus m_1''](.) = [m_1'' \oplus m_1'](.)$):


$$m_1''(\bar{f}) = 1 - \epsilon_1 \quad \text{and} \quad m_1''(p \cup \bar{f}) = \epsilon_1$$
$$m_1'(p) = 1$$


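In a similar manner, working on $\Theta_2 = \{b, f\}$ for source $\mathcal{B}_2$ with the condition $\mathrm{Bel}_2(f|b) = 1 - \epsilon_2$, the mass
$m_2(.|b)$ results from the internal DSm combination of the two following assignments


$$m_2''(f) = 1 - \epsilon_2 \quad \text{and} \quad m_2''(b \cup f) = \epsilon_2$$
$$m_2'(b) = 1$$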


Similarly, working on $\Theta_3 = \{p, b\}$ for source $\mathcal{B}_3$ with the condition $\mathrm{Bel}_3(b|p) = 1 - \epsilon_3$, the mass $m_3(.|p)$ results
from the internal DSm combination of the two following assignments


$$m_3''(b) = 1 - \epsilon_3 \quad \text{and} \quad m_3''(b \cup p) = \epsilon_3 \tag{36}$$
$$m_3'(p) = 1$$


It can be easily verified that these (less specific) basic belief assignments generate the conditions $\mathrm{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$,
$\mathrm{Bel}_2(f|b) = 1 - \epsilon_2$ and $\mathrm{Bel}_3(b|p) = 1 - \epsilon_3$.
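
This verification can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the authors' procedure: propositions on a two-element free frame are encoded as sets of Venn regions, the classic DSm rule is taken as the plain conjunctive product on that free model, and the names `free_frame`, `classic_dsm`, `belief` and `rule_source`, as well as the label `"nf"` for the non-flying element, are hypothetical.

```python
def free_frame(x, y):
    """Return the two singletons of a free 2-element frame as region sets."""
    regions = [frozenset({x}), frozenset({y}), frozenset({x, y})]
    X = frozenset(a for a in regions if x in a)
    Y = frozenset(a for a in regions if y in a)
    return X, Y


def classic_dsm(m_a, m_b):
    """Classic DSm rule on the free model: mass products go to intersections."""
    out = {}
    for A, a in m_a.items():
        for B, b in m_b.items():
            out[A & B] = out.get(A & B, 0.0) + a * b
    return out


def belief(m, target):
    """Sum the masses of all non-empty focal elements included in target."""
    return sum(v for A, v in m.items() if A and A <= target)


def rule_source(condition, conclusion, eps):
    """Internal combination for a rule 'condition -> conclusion' with weight 1 - eps."""
    C, D = free_frame(condition, conclusion)
    categorical = {C: 1.0}                 # mass 1 on the conditioning element
    rule = {D: 1.0 - eps, C | D: eps}      # less specific assignment
    fused = classic_dsm(categorical, rule)
    return fused, belief(fused, D)


# Illustrative epsilon values only; "nf" is an assumed label for the non-flying element.
for cond, concl, eps in [("p", "nf", 0.1), ("b", "f", 0.2), ("p", "b", 0.3)]:
    fused, bel = rule_source(cond, concl, eps)
    print(cond, "->", concl, "belief:", bel)
```

In each case the fused assignment puts $1 - \epsilon_i$ on the intersection of the condition and the conclusion and $\epsilon_i$ on the condition alone, so the belief in the conclusion is $1 - \epsilon_i$.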


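Now let's examine the result of the fusion of all these masses based on DSmT, i.e. by applying the DSm rule of
combination to the following basic belief assignments


$$m_1(p \cap \bar{f}|p) = 1 - \epsilon_1 \quad \text{and} \quad m_1(p|p) = \epsilon_1$$
$$m_2(b \cap f|b) = 1 - \epsilon_2 \quad \text{and} \quad m_2(b|b) = \epsilon_2$$
$$m_3(p \cap b|p) = 1 - \epsilon_3 \quad \text{and} \quad m_3(p|p) = \epsilon_3$$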


Note that these basic belief assignments turn out to be identical to those drawn from the DST framework analysis done in the
previous section for this specific problem, because of the integrity constraint $f \cap \bar{f} = \emptyset$ and the MCP, but they actually result from
a slightly different and simpler analysis drawn here from DSmT. So we attack the TP2 with the same information as in
the analysis based on DST, but we will show that a coherent conclusion can be drawn with DSm reasoning.





Let us emphasize now that one has to deal here with the hypotheses/elements $p$, $b$, $f$ and $\bar{f}$, and thus our global frame
is given by $\Theta = \{b, p, f, \bar{f}\}$. Note that $\Theta$ doesn't satisfy Shafer's model since the elements of $\Theta$ are not all exclusive.
This is a major difference between the foundations of DSmT and the foundations of DST. But because only
$f$ and $\bar{f}$ are truly exclusive, i.e. $f \cap \bar{f} = \emptyset$, we face a simple hybrid DSm model $\mathcal{M}$ and thus the hybrid DSm fusion
must apply rather than the classic DSm rule. We recall briefly here (a complete derivation, justification and examples can
be found in [21]) that the hybrid DSm rule of combination associated with a given hybrid DSm model for $k \geq 2$ independent
sources of information is defined for all $A \in D^\Theta$ as:


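$$m_{\mathcal{M}(\Theta)}(A) \triangleq \phi(A) \Big[ S_1(A) + S_2(A) + S_3(A) \Big] \tag{37}$$


where $\phi(A)$ is the characteristic emptiness function of the set $A$, i.e. $\phi(A) = 1$ if $A \notin \boldsymbol{\emptyset}$ ($\boldsymbol{\emptyset} \triangleq \{\emptyset_{\mathcal{M}}, \emptyset\}$ being the set of
all relatively and absolutely empty elements) and $\phi(A) = 0$ otherwise, and


$$S_1(A) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_k \in D^\Theta \\ (X_1 \cap X_2 \cap \ldots \cap X_k) = A}} \; \prod_{i=1}^{k} m_i(X_i) \tag{38}$$

$$S_2(A) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_k \in \boldsymbol{\emptyset} \\ [\mathcal{U} = A] \vee [(\mathcal{U} \in \boldsymbol{\emptyset}) \wedge (A = I_t)]}} \; \prod_{i=1}^{k} m_i(X_i) \tag{39}$$

$$S_3(A) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_k \in D^\Theta \\ (X_1 \cup X_2 \cup \ldots \cup X_k) = A \\ (X_1 \cap X_2 \cap \ldots \cap X_k) \in \boldsymbol{\emptyset}}} \; \prod_{i=1}^{k} m_i(X_i) \tag{40}$$

with $\mathcal{U} \triangleq u(X_1) \cup u(X_2) \cup \ldots \cup u(X_k)$, where $u(X)$ is the union of all singletons $\theta_i$ that compose $X$, and $I_t \triangleq
\theta_1 \cup \theta_2 \cup \ldots \cup \theta_n$ is the total ignorance defined on the frame $\Theta = \{\theta_1, \ldots, \theta_n\}$. For example, if $X$ is a singleton then
$u(X) = X$; if $X = \theta_1 \cap \theta_2$ or $X = \theta_1 \cup \theta_2$ then $u(X) = \theta_1 \cup \theta_2$; if $X = (\theta_1 \cap \theta_2) \cup \theta_3$ then $u(X) = \theta_1 \cup \theta_2 \cup \theta_3$; by
convention $u(\emptyset) \triangleq \emptyset$.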


The first sum $S_1(A)$ entering in the previous formula corresponds to the mass $m_{\mathcal{M}^f(\Theta)}(A)$ obtained by the classic DSm
rule of combination based on the free DSm model $\mathcal{M}^f$ (i.e. on the free lattice $D^\Theta$). The second sum $S_2(A)$ entering in the
formula of the hybrid DSm rule of combination represents the mass of all relatively and absolutely empty sets, which
is transferred to the total or relative ignorances. The third sum $S_3(A)$ entering in the formula of the hybrid DSm rule of
combination transfers the sum of relatively empty sets to the non-empty sets in the same way as it was calculated
following the DSm classic rule.


To apply the DSm hybrid fusion rule formula (37), it is important to note that $(p \cap \bar{f}) \cap (b \cap f) \cap p = p \cap b \cap f \cap \bar{f} = \emptyset$
because $f \cap \bar{f} = \emptyset$; thus the mass $(1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3$ is transferred to the hybrid proposition $H_1 \triangleq (p \cap \bar{f}) \cup (b \cap f) \cup p =
(b \cap f) \cup p$. Similarly, $(p \cap \bar{f}) \cap (b \cap f) \cap (p \cap b) = p \cap b \cap f \cap \bar{f} = \emptyset$ because $f \cap \bar{f} = \emptyset$, and therefore its associated mass
$(1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3)$ is transferred to the hybrid proposition $H_2 \triangleq (p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)$. No other mass transfer
is necessary for this Tweety Penguin Triangle Problem and thus we finally get from the DSm hybrid fusion formula the
following result for $m_{123}(.|p \cap b) = [m_1 \oplus m_2 \oplus m_3](.)$ (where the $\oplus$ symbol corresponds here to the DSm fusion operator):


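$$\begin{aligned}
m_{123}(H_1 | p \cap b) &= (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3 \\
m_{123}(H_2 | p \cap b) &= (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3) \\
m_{123}(p \cap b \cap \bar{f} | p \cap b) &= (1 - \epsilon_1)\epsilon_2\epsilon_3 + (1 - \epsilon_1)\epsilon_2(1 - \epsilon_3) \\
m_{123}(p \cap b \cap f | p \cap b) &= \epsilon_1(1 - \epsilon_2)\epsilon_3 + \epsilon_1(1 - \epsilon_2)(1 - \epsilon_3) \\
m_{123}(p \cap b | p \cap b) &= \epsilon_1\epsilon_2\epsilon_3 + \epsilon_1\epsilon_2(1 - \epsilon_3)
\end{aligned}$$


with


$$H_1 \triangleq (b \cap f) \cup p$$
$$H_2 \triangleq (p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)$$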


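It can be easily checked that these masses sum up to 1. After elementary algebraic simplifications, one finally gets for
the DSm fusion of all available prior information, reintroducing explicitly the conditioning term,


$$\begin{aligned}
m_{123}(H_1 | p \cap b) &= (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3 \\
m_{123}(H_2 | p \cap b) &= (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3) \\
m_{123}(p \cap b \cap \bar{f} | p \cap b) &= (1 - \epsilon_1)\epsilon_2 \\
m_{123}(p \cap b \cap f | p \cap b) &= \epsilon_1(1 - \epsilon_2) \\
m_{123}(p \cap b | p \cap b) &= \epsilon_1\epsilon_2
\end{aligned}$$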
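
These five masses can be reproduced numerically. The following Python sketch is a simplified illustration, not a general implementation of the hybrid DSm rule: it assumes that every focal element of every source is non-empty, so only two cases occur (a non-empty intersection keeps the product mass, as in $S_1$; an empty intersection sends it to the union, as in $S_3$). Propositions are encoded as sets of Venn regions, which is an assumed encoding, and the names `region`, `hybrid_dsm`, the label `"nf"` for $\bar{f}$ and the numerical values of $\epsilon_1, \epsilon_2, \epsilon_3$ are hypothetical.

```python
from itertools import combinations, product

NAMES = ("b", "p", "f", "nf")  # "nf" is an assumed label for f-bar

# Venn regions of the hybrid model: every non-empty subset of NAMES except
# those containing both "f" and "nf" (integrity constraint on f and f-bar).
ATOMS = [frozenset(c)
         for r in range(1, len(NAMES) + 1)
         for c in combinations(NAMES, r)
         if not {"f", "nf"} <= set(c)]


def region(name):
    """Proposition for a singleton: all Venn regions lying inside it."""
    return frozenset(a for a in ATOMS if name in a)


b, p, f, nf = (region(n) for n in NAMES)


def hybrid_dsm(*sources):
    """Simplified hybrid DSm combination (non-empty focal elements assumed).

    Returns the fused masses and the total mass moved off empty intersections.
    """
    fused, moved = {}, 0.0
    for combo in product(*(src.items() for src in sources)):
        mass = 1.0
        for _, m in combo:
            mass *= m
        sets = [x for x, _ in combo]
        target = frozenset.intersection(*sets)   # S1 case
        if not target:                           # empty under the constraint
            target = frozenset.union(*sets)      # S3 case: transfer to union
            moved += mass
        fused[target] = fused.get(target, 0.0) + mass
    return fused, moved


e1, e2, e3 = 0.1, 0.2, 0.3  # illustrative values only
m1 = {p & nf: 1 - e1, p: e1}
m2 = {b & f: 1 - e2, b: e2}
m3 = {p & b: 1 - e3, p: e3}

fused, moved = hybrid_dsm(m1, m2, m3)
H1 = (b & f) | p
H2 = (p & nf) | (b & f) | (p & b)
for label, prop in [("H1", H1), ("H2", H2), ("p&b&nf", p & b & nf),
                    ("p&b&f", p & b & f), ("p&b", p & b)]:
    print(label, fused[prop])
```

With this encoding the two transferred products land on $H_1$ and $H_2$ without any special casing, because $(p \cap \bar{f}) \cup (b \cap f) \cup p$ and $(b \cap f) \cup p$ cover the same regions.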


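We can check that all these masses add up to 1 and that this result is fully coherent with rational intuition, especially
when $\epsilon_3 = 0$, because the non-null components of $m_{123}(.|p \cap b)$ then reduce to


$$\begin{aligned}
m_{123}(H_2 | p \cap b) &= (1 - \epsilon_1)(1 - \epsilon_2) \\
m_{123}(p \cap b \cap \bar{f} | p \cap b) &= (1 - \epsilon_1)\epsilon_2 \\
m_{123}(p \cap b \cap f | p \cap b) &= \epsilon_1(1 - \epsilon_2) \\
m_{123}(p \cap b | p \cap b) &= \epsilon_1\epsilon_2
\end{aligned}$$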


which means that, from our DSm reasoning, there is a strong uncertainty (due to the conflicting rules of our rule-based
system), when $\epsilon_1$ and $\epsilon_2$ remain small positive numbers, that a penguin-bird animal is either a penguin-nonflying animal
or a bird-flying animal. The small value $\epsilon_1\epsilon_2$ for $m_{123}(p \cap b | p \cap b)$ expresses adequately the fact that we cannot commit
a strong basic belief assignment only to $p \cap b$ knowing $p \cap b$, just because one works on $\Theta = \{p, b, f, \bar{f}\}$ and we cannot
consider the property $p \cap b$ solely, because the "birdness" or "penguinness" property necessarily entails either the flying or the
non-flying property.


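Therefore the belief assignment for the particular observed penguin-bird animal Tweety (corresponding to the particular mass
$m_o(T = (p \cap b)) = 1$) can be easily derived from the DSm fusion of all our prior knowledge summarized by $m_{123}(.|p \cap b)$ and the
available observation summarized by $m_o(.)$, and we get


$$\begin{aligned}
m_{o123}(T = (p \cap b \cap \bar{f}) | T = (p \cap b)) &= (1 - \epsilon_1)\epsilon_2 \\
m_{o123}(T = (p \cap b \cap f) | T = (p \cap b)) &= \epsilon_1(1 - \epsilon_2) \\
m_{o123}(T = (p \cap b) | T = (p \cap b)) &= \epsilon_1\epsilon_2 \\
m_{o123}(T = H_1 | T = (p \cap b)) &= (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3 \\
m_{o123}(T = H_2 | T = (p \cap b)) &= (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3)
\end{aligned}$$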


From the DSm reasoning, the belief that Tweety can fly is then given by


$$\mathrm{Bel}(T = f | T = (p \cap b)) = \sum_{x \in D^\Theta,\, x \subseteq f} m_{o123}(T = x | T = (p \cap b))$$


Using all the components of $m_{o123}(.|T = (p \cap b))$, one directly gets


$$\mathrm{Bel}(T = f | T = (p \cap b)) = m_{o123}(T = (f \cap b \cap p) | T = (p \cap b))$$


and finally
$$\mathrm{Bel}(T = f | T = (p \cap b)) = \epsilon_1(1 - \epsilon_2) \tag{41}$$


In a similar way, one will get for the belief that Tweety cannot fly
$$\mathrm{Bel}(T = \bar{f} | T = (p \cap b)) = \epsilon_2(1 - \epsilon_1) \tag{42}$$
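
The two beliefs can be recovered by summing the masses of the focal elements included in $f$ or in $\bar{f}$. The sketch below is an illustration only: it takes the five fused masses in their closed form, encodes propositions as sets of Venn regions (an assumed encoding, with `"nf"` as a hypothetical label for $\bar{f}$), and uses made-up values for $\epsilon_1, \epsilon_2, \epsilon_3$.

```python
from itertools import combinations

NAMES = ("b", "p", "f", "nf")  # "nf" is an assumed label for f-bar

# Venn regions compatible with the constraint that f and f-bar are exclusive.
ATOMS = [frozenset(c)
         for r in range(1, len(NAMES) + 1)
         for c in combinations(NAMES, r)
         if not {"f", "nf"} <= set(c)]


def region(name):
    """Proposition for a singleton: all Venn regions lying inside it."""
    return frozenset(a for a in ATOMS if name in a)


b, p, f, nf = (region(n) for n in NAMES)


def fused_masses(e1, e2, e3):
    """Closed-form fused masses on the five focal elements of the problem."""
    H1 = (b & f) | p
    H2 = (p & nf) | (b & f) | (p & b)
    return {
        H1: (1 - e1) * (1 - e2) * e3,
        H2: (1 - e1) * (1 - e2) * (1 - e3),
        p & b & nf: (1 - e1) * e2,
        p & b & f: e1 * (1 - e2),
        p & b: e1 * e2,
    }


def belief(masses, target):
    """Sum the masses of all non-empty focal elements included in target."""
    return sum(m for x, m in masses.items() if x and x <= target)


masses = fused_masses(0.1, 0.2, 0.3)  # illustrative values only
bel_fly = belief(masses, f)
bel_not_fly = belief(masses, nf)
print(bel_fly, bel_not_fly)
```

Only $p \cap b \cap f$ is included in $f$ and only $p \cap b \cap \bar{f}$ is included in $\bar{f}$, which is why each belief reduces to a single mass.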


So now for both cases the beliefs remain very low, which is normal and coherent with the analysis done in section 3.2.
Now let's examine the plausibilities of the ability for Tweety to fly or not to fly. These are given by


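$$\mathrm{Pl}(T = f | T = (p \cap b)) \triangleq \sum_{x \in D^\Theta,\, x \cap f \neq \emptyset} m_{o123}(T = x | T = (p \cap b))$$


$$\mathrm{Pl}(T = \bar{f} | T = (p \cap b)) \triangleq \sum_{x \in D^\Theta,\, x \cap \bar{f} \neq \emptyset} m_{o123}(T = x | T = (p \cap b))$$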


which turn out to be, after elementary algebraic manipulations,


$$\mathrm{Pl}(T = f | T = (p \cap b)) = (1 - \epsilon_2) \tag{43}$$


$$\mathrm{Pl}(T = \bar{f} | T = (p \cap b)) = (1 - \epsilon_1) \tag{44}$$


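So we conclude, as expected, that we can't decide on the ability of Tweety to fly or not to fly, since one has
$$[\mathrm{Bel}(f | p \cap b), \mathrm{Pl}(f | p \cap b)] = [\epsilon_1(1 - \epsilon_2), (1 - \epsilon_2)] \approx [0, 1]$$


$$[\mathrm{Bel}(\bar{f} | p \cap b), \mathrm{Pl}(\bar{f} | p \cap b)] = [\epsilon_2(1 - \epsilon_1), (1 - \epsilon_1)] \approx [0, 1]$$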


Note that when setting $\epsilon_1 = 0$ and $\epsilon_2 = 1$ (or $\epsilon_1 = 1$ and $\epsilon_2 = 0$), i.e. when one forces the full consistency of the initial
rule-based system, one gets a coherent result on the certainty of the ability of Tweety not to fly (or to fly, respectively).
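
To make the behaviour of these intervals concrete, the closed-form bounds stated in (41)-(44) can be evaluated for a few settings. The snippet below is only a numerical illustration of those stated expressions; the function name `tweety_intervals` and the sample values are hypothetical.

```python
def tweety_intervals(e1, e2):
    """Evaluate the [Bel, Pl] intervals as stated in eqs. (41)-(44)."""
    fly = (e1 * (1 - e2), 1 - e2)        # (Bel, Pl) for "Tweety flies"
    not_fly = (e2 * (1 - e1), 1 - e1)    # (Bel, Pl) for "Tweety does not fly"
    return fly, not_fly


# Small epsilons (conflicting rules), then the two fully consistent settings.
for e1, e2 in [(0.01, 0.01), (0.0, 1.0), (1.0, 0.0)]:
    fly, not_fly = tweety_intervals(e1, e2)
    print(f"e1={e1}, e2={e2}: fly={fly}, not_fly={not_fly}")
```

For small $\epsilon_1$ and $\epsilon_2$ both intervals are close to $[0, 1]$, while the two consistent settings collapse them to a single point.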


This coherent result (radically different from the one based on Dempster-Shafer reasoning but starting with exactly
the same available information) comes from the DSm hybrid fusion rule, which transfers some parts of the mass of the empty
set $m(\emptyset) = (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3 + (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3) \approx 1$ onto the propositions $H_1$ and $H_2$. It is clear, however, that
the high value of $m(\emptyset)$ in this TP2 indicates a highly conflicting fusion problem, which proves that the TP2 is truly an almost
impossible problem, and the fusion result based on DSmT reasoning allows us to conclude on the true undecidability of
the ability of Tweety to fly or not to fly. In other words, the fusion based on DSmT can be applied adequately to
this almost impossible problem and concludes correctly on its undecidability. Another simplistic solution would consist
in saying that the problem has to be considered as an impossible one just because $m(\emptyset) > 0.5$.





6 Conclusion 


In this paper we have proposed a deep analysis of the challenging Tweety Penguin Triangle Problem. The analysis proves
that Bayesian reasoning cannot be mathematically justified to characterize the problem because the probabilistic
model doesn't hold, even with the help of the principle of indifference and the conditional independence
assumption. Any conclusions drawn from such a representation of the problem based on a hypothetical probabilistic model
are actually based on fallacious Bayesian reasoning. This is a fundamental result. We have then shown how
Dempster-Shafer reasoning manages the uncertainty and the conflict in this problem in what we feel is a wrong way. We
then proved that DSmT can deal properly with this problem and provides a well-founded and reasonable conclusion
about the undecidability of its solution.